Tianyi Xu

OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration

Feb 05, 2026
Via arXiv

MGSM-Pro: A Simple Strategy for Robust Multilingual Mathematical Reasoning Evaluation

Jan 29, 2026
Via arXiv

SITA: Learning Speaker-Invariant and Tone-Aware Speech Representations for Low-Resource Tonal Languages

Jan 14, 2026
Via arXiv

AfriqueLLM: How Data Mixing and Model Architecture Impact Continued Pre-training for African Languages

Jan 10, 2026
Via arXiv

StepORLM: A Self-Evolving Framework With Generative Process Supervision For Operations Research Language Models

Sep 26, 2025
Via arXiv

Online Learning with Probing for Sequential User-Centric Selection

Jul 27, 2025
Via arXiv

Fair Algorithms with Probing for Multi-Agent Multi-Armed Bandits

Jun 17, 2025
Via arXiv

Leveraging LLM and Self-Supervised Training Models for Speech Recognition in Chinese Dialects: A Comparative Analysis

May 27, 2025
Via arXiv

REAL-Prover: Retrieval Augmented Lean Prover for Mathematical Reasoning

May 27, 2025
Via arXiv

Large Language Model Compression via the Nested Activation-Aware Decomposition

Mar 21, 2025
Via arXiv